SYNTHESIS AND SCREENING OF LIGANDS USING X-RAY CRYSTALLOGRAPHY

This invention relates to methods for synthesising compounds and
identifying from those compounds ligands that bind target
macromolecules using X-ray crystallography.



Background 

Determining the structure of proteins by X-ray crystallography is
an elegant and reliable method and is the basis of structure-based
ligand design, in which small molecules are synthesized as
potential ligands for the protein of interest. This is an intense
area of research for the optimisation of ligands to drugs for
therapeutically interesting proteins (see Babine, R.E. and
Bender, S.L., Chemical Reviews, 97, 1359-1472 (1997) and Bohacek,
R.S., et al., Med. Res. Rev., 16, 3-50 (1996)).

One method of ligand screening is described in WO 99/45379, in
which a library of shape-diverse compounds thought to be
potential ligands is soaked or co-crystallised with a target
protein, and then the resulting complex is analysed by X-ray
crystallography to determine the nature of the ligand which has
bound. The library of compounds which is used in the screening
process generally comprises previously characterised compounds.

Summary of the Invention

The above described method requires every potential ligand to be
synthesized, purified and characterized before it can form part
of a library for screening. If the potential ligands are simply
purchased from commercial sources, their costs will typically be
high due to the amount of work required to carry out these three
steps. If, in the alternative, the compounds are to be
synthesized 'in-house', then a great deal of time and effort will
need to be expended on assembling the library of ligands for
screening.

The present inventors have developed a method where a collection
of compounds is synthesized and then screened without the need
for any purification and/or characterisation steps.

Accordingly, the present invention provides a method for 
identifying a ligand of a target macromolecule comprising the 
steps of: 

a) soaking one or more crystals of the target macromolecule 
in a solution containing a collection of compounds 
generated in situ or separate from the crystal, where the 
solution has been prepared without the purification, and 
preferably without the characterisation, of the 
synthesized collection of compounds; 

b) obtaining an X-ray crystal diffraction pattern of the 
soaked macromolecule crystal; and 

c) using said X-ray crystal diffraction pattern to identify 
any compound bound to the macromolecule crystal, said 
compound being a ligand of the target macromolecule. 

Any solvated crystal system in which the solvent and/or ligand 
molecules are able to infiltrate throughout the crystal via 
diffusion, and where the crystal system is compatible with X-ray 
diffraction data collection, is suitable for use in the 
invention. 

Examples of appropriate macromolecules are polypeptides
(proteins), ribonucleic acids (RNAs, ribozymes, etc.),
deoxyribonucleic acids (DNAs), and complexes of combinations of
the three examples, e.g. ribosomes or viruses (DNA and/or
RNA-protein complexes).

A ligand is a molecule which can bind to a macromolecule. For a
polypeptide chain (protein), this is anything that is not coded
for by the DNA sequence of the protein. This covers the
post-translational modification of proteins (e.g. covalent
attachment of sugars, etc.), the covalent and non-covalent
attachment of cofactors (e.g. haem groups), the binding of other
polypeptides or amino acids, the binding of small molecules (e.g.
drugs, substrates, etc.) and the binding of DNA and RNA to
proteins. For nucleic acids, a ligand is any molecule bound
either covalently or non-covalently to DNA or RNA (e.g. ligands
intercalated between bases).

In the present invention, the compounds can be synthesised by
parallel synthesis, or (more conveniently) by combinatorial
chemistry. Traditionally, combinatorial chemistry is used to
generate small molecule inhibitors for screening against one or
more biological targets. The synthesis of libraries of compounds
has generally been aimed at producing compounds either as
purified single entities or as high-quality mixtures of compounds
using methodology that allows for deconvolution of the mixture,
once it has been determined that an active compound is to be
found in that mixture. Deconvolution requires the re-testing of
each member compound of the active library. Thus either method
requires the careful analysis and characterisation of the
libraries to allow for interpretation of the data generated
during biological screening against the target macromolecule. As
mentioned above, the present invention does not require the
purification and/or characterisation of the members of the
synthesised library, as the identity of the ligand is determined
by X-ray crystallography of the ligand-macromolecule complex.

The solution containing the collection of compounds can be
prepared in two main ways, herein called 'in-situ' synthesis and
'just-in-time' synthesis.

'In-situ' synthesis involves synthesizing the collection of
compounds in a solution which also contains the one or more
crystals of the target macromolecule, and therefore requires the
use of chemistries which can be carried out under conditions in
which the macromolecule crystal will remain stable.

Accordingly, a first aspect of the present invention provides a
method for identifying a ligand of a target macromolecule
comprising the steps of:

a) synthesizing a collection of compounds, which are
suitable for screening against a target macromolecule, in
a solution containing one or more crystals of the target
macromolecule;

b) obtaining an X-ray crystal diffraction pattern of the 
soaked macromolecule crystal; and 

c) using said X-ray crystal diffraction pattern to identify 
any compound bound to the macromolecule crystal, said 
compound being a ligand of the target macromolecule. 

In this method, the synthesis of the collection of compounds will 
take place in a single reaction vessel, i.e. the vessel in which 
the solution containing one or more crystals of the target 
macromolecule is present. 

'Just-in-time' synthesis involves synthesizing the collection of 
compounds remotely from the solution which contains the one or 
more crystals of the target protein, and then transferring the 
synthesized collection into the solution containing the one or 
more crystals of the target protein. No purification and/or 
characterisation of the synthesized collection is carried out. 
The synthesis may take place in a solvent which is not compatible 
with the macromolecule crystals, from which the collection of 
compounds must be separated in order to add them to the solution 
containing the macromolecule crystals. 

Accordingly, a second aspect of the present invention provides a 
method for identifying a ligand of a target macromolecule 
comprising the steps of: 

a) synthesizing a collection of unpurified compounds
suitable for screening against a target macromolecule;

b) adding the collection of compounds to a solution 
containing one or more crystals of the target 
macromolecule; 

c) obtaining an X-ray crystal diffraction pattern of the 
soaked macromolecule crystal; and 

d) using said X-ray crystal diffraction pattern to identify
any compound bound to the macromolecule crystal, said
compound being a ligand of the target macromolecule.

In this method, the synthesis of the collection of unpurified
compounds may occur in one or more reaction vessels.

If step a) takes place in a solvent which is not compatible with
the macromolecule crystals, then after step a) the collection of
compounds is separated from the solvent in which the compounds
were synthesised.

Typically, such non-compatible solvents are organic and the step
of separating the collection of compounds from this solvent is
usually carried out by evaporating the solvent. This is then
followed by re-dissolution of the collection of compounds in the
solution containing the one or more macromolecule crystals.

Enzyme catalysis in organic solvents has attracted much interest
in recent years (Mattos C. and Ringe D., Curr. Opin. Struct.
Biol., 11(6), 761-4) and the use of enzymes in non-aqueous media
has extended the field of biocatalysis (ASGSB Bull., 4(2), 125-132
(1991)). Much work has been done to map out organic binding sites
in crystals by soaking the crystals in organic solvents (English,
A.C., et al., Proteins, 31, 628-640 (1999); Mattos C. and Ringe
D., Nature Biotechnol., 14(5), 595-9 (1996)). In addition it has
been shown that enzyme crystals can retain activity in organic
solvents, both in the presence and absence of crosslinking agents
(Ayala, M., et al., Biochem. Biophys. Res. Comm., 295(4), 828-31
(2002)).

An alternative approach is to separate the non-compatible solvent
from the solution containing the one or more macromolecule
crystals by a permeable membrane, which allows transfer of the
compounds in the collection from the non-compatible solvent to
the solution containing the one or more macromolecule crystals.

This approach requires a membrane which is porous enough to allow
the diffusion of the synthesised compounds from the solvent in
which they were synthesised to the solution containing the one or
more macromolecules, whilst substantially preventing any
diffusion of the solvents. Dialysis buttons provide one means by
which this can occur, and are available, for example, from
Hampton Research. Their use is described in 'Crystallisation of
Nucleic Acids and Proteins', edited by Ducruix, A. and Giege, R.,
The Practical Approach Series, Oxford University Press, 1992.

The ligands identified by the methods of the present invention 
may be subsequently modified to alter their binding to the target 
macromolecule or to improve their usefulness as a pharmaceutical. 

Such modification is conventional in the art. Possible 
modifications include: substitution or removal of groups 
containing residues which interact with the target macromolecule, 
for example groups which interact with the amino acid side chain 
groups of a protein; the addition or removal of groups in order 
to decrease or increase the charge of a group in a compound; the 
replacement of a charged group with a group of the opposite
charge; or the replacement of a hydrophobic group with a
hydrophilic group or vice versa. Additionally, a group may be
replaced with another retaining similar properties but that
better occupies the cavity in the macromolecule, increasing the
surface of the ligand in contact with the macromolecule cavity.
This may be achieved using the methodologies disclosed in this
invention, or by conventional synthetic approaches typically
utilised by those skilled in the art of medicinal chemistry.
Many of these changes will improve the usefulness of a compound
as a pharmaceutical. It will be understood that these are only
examples of the type of substitutions considered by medicinal 
chemists in the development of new pharmaceutical compounds and 
other modifications may be made, depending upon the nature of the 
starting compound and its activity. 

Without wishing to be bound by theory, the detection of the
ligand bound to the target macromolecule relies on the occupancy
in the macromolecule crystals of one of the highest affinity
ligands, this being driven by ligand-macromolecule interactions.
This method avoids disadvantages associated with biological
screening methods in which the alteration of macromolecule
activity by a potential ligand is assessed, as in that case
compounds which bind weakly but non-specifically can alter
macromolecule activity in a non-selective manner. Such
non-selective inhibition produces false-positives, as the assay
shows protein activity inhibition, but the compound would perform
no useful function as a drug, as it would interfere with the
activity of other proteins. Only compounds bound in a binding
site will be detected by the present method. In particular, only
compounds bound in a binding site with resolvable occupancy will
be detected by the present method.
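
The occupancy argument above can be illustrated numerically. The
following is a minimal sketch only, not part of the described
method. It assumes simple competitive binding of several compounds
at a single site, equilibrium, and ligand in large excess so that
free and total ligand concentrations are approximately equal. The
function name and the example concentrations and Ki values are
hypothetical, with Ki taken as the dissociation constant of each
compound for the site.

```python
def competitive_occupancies(concentrations, kis):
    """Fractional occupancy of one binding site by each competing ligand.

    Assumes equilibrium, a single site, and free ligand ~ total ligand:
        occupancy_i = ([L_i] / K_i) / (1 + sum_j [L_j] / K_j)
    """
    if len(concentrations) != len(kis):
        raise ValueError("one Ki is needed per ligand")
    ratios = [c / k for c, k in zip(concentrations, kis)]
    denominator = 1.0 + sum(ratios)
    return [r / denominator for r in ratios]


# Hypothetical unpurified mixture: three compounds, each at 1 mM,
# with Ki values of 1 uM, 100 uM and 10 mM (all values in molar).
mixture_conc = [1e-3, 1e-3, 1e-3]
mixture_ki = [1e-6, 1e-4, 1e-2]

occupancies = competitive_occupancies(mixture_conc, mixture_ki)
for ki, occ in zip(mixture_ki, occupancies):
    print(f"Ki = {ki:.0e} M -> fractional occupancy {occ:.4f}")
```

Under these assumptions the tightest-binding compound takes roughly
99% of the site, which is why electron density for a single
high-affinity ligand can be expected even when the soak contains an
uncharacterised mixture.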


Binding sites are sites within a macromolecule, or on its
surface, at which ligands can bind. Examples are the catalytic or
active site of an enzyme (the site on an enzyme at which the
amino acid residues involved in catalysing the enzymatic reaction
are located), allosteric binding sites (ligand binding sites
distinct from the catalytic site, but which can modulate
enzymatic activity upon ligand binding), cofactor binding sites
(sites involved in binding/co-ordinating cofactors e.g. metal
ions), or substrate binding sites (the ligand binding sites on a
protein at which the substrates for the enzymatic reaction bind).
There are also sites of protein-protein interaction. If the
macromolecule is a nucleic acid, then binding sites may be the
bases of the nucleic acid, or spaces in their structures, e.g.
the major or minor grooves in helical DNA, the phosphate, ribose
or deoxyribose groups, or the spaces between the bases into which
ligands intercalate.

The present method also enables screening where the target
macromolecule has more than one active site, as the data for each
site can be analysed independently from the other sites to
determine the compound bound in that site. In such cases, the
information the method of the invention provides on the binding
of two or more separate ligands to the target macromolecule can
be used in the linked-fragment approach to drug design, in a
similar manner to the method described by Greer, et al., J. Med.
Chem., 37(8), 1035-1054 (1994) for the synthesis of a thymidylate
synthase inhibitor series. The basic concept behind
linked-fragment approaches to drug design is to determine
(computationally or experimentally) the binding location of
plural ligands to a target molecule, and then to construct a
molecular scaffold to connect the ligands together in such a way
that their relative binding positions are preserved. The methods
of synthesis and screening of the present invention may then be
used to determine the best ligands from a library of such
compounds, or the binding ability of individual compounds can be
assessed using known methods.

Even if the present invention only provides information on the
binding of ligands at a single binding site of a target
macromolecule, a structure-based approach can be used to develop
ligands which interact with further binding sites. Such a
fragment growing approach is described in Blundell, T., et al.,
Nature Reviews Drug Discovery, vol. 1, 45-54 (2002).

It is preferred that the members of the collection of compounds
are present at a concentration of at least 5 to 50 times,
typically at least 10 times, their Ki (depending, case by case,
on the macromolecule used) so that the occupation of the binding
site in the target macromolecule will not depend on the relative
quantities of each compound in the collection.

In the case of competitive binding of a ligand to a macromolecule
Ki is defined as:

$$K_i = \frac{[M][L]}{[ML]}$$

where [M] is the concentration of the macromolecule, [L] is the
concentration of free ligand, and [ML] is the concentration of
the ligand-macromolecule complex.

Where inhibitors are binding in an uncompetitive or
non-competitive fashion the Ki is defined as in Fundamentals of
Enzyme Kinetics by A. Cornish-Bowden, Portland Press, 1995, ISBN
1 85579 0720, which is herein incorporated by reference.
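
To show what a concentration of 5 to 50 times Ki implies for the
competitive case, the definition of Ki can be rearranged to give
the fraction of macromolecule carrying ligand, [ML]/([M] + [ML]) =
[L]/([L] + Ki). The sketch below is illustrative only. It assumes
that [M] in the definition refers to unliganded macromolecule and
that a single ligand is present; the function name is hypothetical.

```python
def fractional_occupancy(free_ligand, ki):
    """Fraction of macromolecule sites occupied: [ML] / ([M] + [ML]).

    Rearranging Ki = [M][L] / [ML] gives [L] / ([L] + Ki).
    Both arguments must be in the same concentration units.
    """
    if free_ligand < 0 or ki <= 0:
        raise ValueError("ligand must be non-negative and Ki positive")
    return free_ligand / (free_ligand + ki)


ki = 1.0  # arbitrary units; only the ratio [L] / Ki matters
for multiple in (1, 5, 10, 50):
    occ = fractional_occupancy(multiple * ki, ki)
    print(f"[L] = {multiple:>2} x Ki -> fractional occupancy {occ:.3f}")
```

With the ligand at m times Ki the occupancy is m/(m + 1), so the
preferred range corresponds to sites that are mostly filled, for
example 10/11 of sites at 10 times Ki.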

It is preferred that each compound in the collection is present
in the solution at a concentration which is at least 5 or 10
times the concentration of the target macromolecule in the
reaction system, more preferably 100, 1 000 or even 10 000 times
the concentration of the target macromolecule in the reaction
system.
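
The two concentration preferences stated in the text (a multiple
of Ki, and a multiple of the macromolecule concentration) can be
combined by taking whichever requirement is larger. The following
is a hedged sketch under that assumption; the function name, the
default multiples of 10 and the example values are hypothetical and
chosen only from within the ranges given above.

```python
def minimum_soak_concentration(ki, macromolecule_conc,
                               ki_multiple=10.0, excess_multiple=10.0):
    """Smallest ligand concentration satisfying both preferences.

    ki_multiple     : required multiple of Ki (text suggests 5 to 50)
    excess_multiple : required multiple of the macromolecule
                      concentration (text suggests 5 up to 10 000)
    All concentrations must share the same units.
    """
    if ki <= 0 or macromolecule_conc <= 0:
        raise ValueError("Ki and macromolecule concentration must be positive")
    return max(ki_multiple * ki, excess_multiple * macromolecule_conc)


# Hypothetical weak binder (Ki = 50 uM) and 1 uM macromolecule:
# here the Ki criterion is the limiting one.
weak = minimum_soak_concentration(ki=50e-6, macromolecule_conc=1e-6)

# Hypothetical tight binder (Ki = 1 nM) and 1 uM macromolecule:
# here the excess-over-macromolecule criterion is limiting.
tight = minimum_soak_concentration(ki=1e-9, macromolecule_conc=1e-6)

print(f"weak binder:  {weak:.1e} M")
print(f"tight binder: {tight:.1e} M")
```

In practice the Ki of each member of an uncharacterised collection
is unknown, so such a calculation can only be run for assumed
target affinities.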

The binding of the ligands to the target macromolecule may be 
through non-covalent interactions or covalent bonding. If the 
target macromolecule is a protein, then covalent binding of the 
ligand to the protein may occur when the active site of the 
protein contains a catalytic residue such as in serine and 
cysteine proteases. If the target macromolecule is a nucleic 
acid, then certain classes of compounds are known to interact by 
covalent binding, e.g. pyrrolobenzodiazepines covalently bind to 
the exocyclic amino group of guanine. 

In some embodiments of the present invention, it is preferred 
that the members of the collection of compounds do not bind 
covalently to the target macromolecule, but that they interact 
through non-covalent binding.

Further aspects of the invention relate to any novel compounds 
disclosed herein, their use as pharmaceuticals and their use in 
methods of therapy. In particular, further aspects of the 
invention include:

a) a ligand identified by the method of the present invention, 
or salts, solvates and chemically protected forms thereof; 

b) a pharmaceutical composition comprising a ligand identified
by the method of the present invention, or salts, solvates and
chemically protected forms thereof, and a pharmaceutically
acceptable carrier or diluent;

c) the use of a ligand identified by the method of the present 
invention, or salts, solvates and protected forms thereof, in a 
method of treatment of the human or animal body; 

d) the use of a ligand identified by the method of the present 
invention, or salts, solvates and protected forms thereof, in the 
manufacture of a medicament for the treatment of a disease 
ameliorated by the binding of a ligand to the target 
macromolecule used in the method of the invention; and 

e) a method for the treatment of a disease ameliorated by the
binding of a ligand to the target macromolecule used in the
method of the invention comprising administering to a subject
suffering from said disease a therapeutically-effective amount of
a ligand identified by the method of the present invention, or
salts and solvates thereof.

Suitable carriers and diluents and information on pharmaceutical
compositions can be found in standard pharmaceutical texts, for
example, Handbook of Pharmaceutical Additives, 2nd Edition (eds
M. Ash and I. Ash), 2001 (Synapse Information Resources, Inc.,
Endicott, New York, USA); Remington's Pharmaceutical Sciences,
20th Edition, pub. Lippincott, Williams & Wilkins, 2000; and
Handbook of Pharmaceutical Excipients, 2nd edition, 1994.

Further details of the invention will now be presented by way of 
explanation and example. 

Although the discussion below focuses on the purification,
crystal growth, X-ray crystallography and determination of ligand
structure when the macromolecule is a protein, the techniques 
described are, in general, applicable to other macromolecules, 
such as nucleic acids and complexes, with appropriate 
modifications as known to the person skilled in the art. 

Target Protein Purification 

A specific target protein can be isolated from animal, plant, or
bacterial sources directly, or via recombinant methods. The
generation of recombinant protein, using systems such as insect
cells (e.g. S. frugiperda or Drosophila cells), E. coli, yeast
(S. cerevisiae, S. pombe, P. pastoris, etc.) or modified human
cell lines, means that truncated, or otherwise genetically
engineered, proteins can be generated. A protein crystallography
project to obtain crystals normally necessitates access to a
recombinant protein production system, but the method of the
present invention may be performed with a single crystal, which
may constitute, for example, between 0.1 and 100 µg.

It is generally accepted that the higher the degree of purity and
homogeneity of a protein preparation, the easier it will be
to grow protein crystals from the preparation. Protein purity
reflects the number of protein species within a preparation. It
also refers to the number, and nature, of any other non-protein
species present (e.g. low molecular weight contaminants). An
ideal protein preparation should contain solely one protein
species, or one species of protein complex, in which all the
protein molecules, or protein complexes, are identical in terms
of their amino acid composition, mass, etc. The purity of a
protein preparation may be gauged via a variety of experimental
techniques such as sodium dodecyl sulphate polyacrylamide gel
electrophoresis (SDS-PAGE), mass spectrometry, antibody binding
and detection (Western blotting), etc. Protein purities in excess
of 90% are often deemed acceptable for crystallisation trials,
but practitioners of the art of protein purification will
generally strive for purities in excess of this arbitrary
threshold, due to the perceived benefits of maximising protein
purities.

Within a protein preparation, homogeneity can refer to the degree
of uniformity observed for parameters such as the stoichiometry
of proteins in a multiprotein complex, the mono-dispersity of the
protein/complexes in solution, the oxidation, or protonation,
state of amino-acid side chains within proteins, the uniformity
of post-translational modifications (e.g. are all protein
molecules within the population equivalently phosphorylated or
glycosylated, or have any essential co-factors been uniformly and
correctly incorporated) and the protein conformations that exist
within a given population of protein molecules/complexes. The
homogeneity of a protein preparation may be probed using a
multitude of experimental methods, some of which are: mass
spectrometry, Western blotting, SDS-PAGE, analytical
ultracentrifugation, size-exclusion chromatography, affinity
chromatography, ion-exchange chromatography, hydrophobic
interaction chromatography, surface plasmon resonance, activity
assay, electron microscopy, dynamic light scattering (DLS),
N-terminal sequencing, iso-electric focussing (IEF), proteolytic
digest, fluorescence, circular dichroism (CD), native gel
electrophoresis, bandshift assays, or nuclear magnetic resonance
(NMR). Maximising the degree of homogeneity within a protein
preparation is again deemed desirable, as maximising homogeneity
is also believed to positively correlate with maximising
crystallisability.

The Growth of Protein Crystals 

Crystallisation of any species requires the formation of a 
supersaturated solution of the species in question and a 
nucleation event that is capable of initiating crystal growth. 
Post-nucleation the ambient conditions must be such that crystal 
growth can be sustained until the physical dimensions and 
properties of the crystals thus obtained are adequate for any 
subsequent experimental procedures required. Protein molecules 
typically only retain their structural integrity within an 
aqueous environment. Therefore protein crystals are normally 
grown in the aqueous phase. Protein crystals may grow if a 
nucleation event occurs in a pure and homogeneous protein 
solution that has been driven to a state of super-saturation. 

Protein crystallisation is generally attempted using the vapour
diffusion (sitting drop, hanging drop, sandwich drop, pH gradient,
etc.), dialysis, batch, micro-batch, liquid-liquid diffusion, or
in-gel-crystallisation methods. All these methodologies have
been extensively described (Protein Crystallisation: Techniques,
Strategies and Tips, Edited by T.M. Bergfors, IUL Biotechnology
Series, 1999, Published by International University Line, La
Jolla, California, ISBN: 0-9636817-5-3). All of the
aforementioned crystallisation processes function by generating a
supersaturated protein solution, which promotes the spontaneous
formation of crystallisation nuclei, and which is then
subsequently able to sustain crystal growth.

There are diverse physico-chemical parameters that can influence
whether or not a protein construct, or protein complex, will
crystallise. Typically each protein crystallizes under a unique
set of conditions, which cannot be predicted in advance. Simply
driving the protein concentration to super-saturation, to bring
it out of solution, will generally not work. The result would,
in most cases, be an amorphous precipitate. Some parameters that
may be varied are: the pH of solutions, the choice and
concentration of buffer (if any) (e.g. Phosphate, MES, BIS-TRIS,
TRIS, EES, PIPES, HEPES, MOPS, BICINE, CHES, CAPS, etc.),
temperature, choice of crystallisation method (see above), volume
of crystallisation, protein concentration, the addition of
reducing agents (e.g. DTT, β-mercaptoethanol), detergents (e.g.
decyl-β-D-maltoside, dodecyl-β-D-maltoside,
octyl-β-D-glucopyranoside, decanoyl-N-methylglucamide, Triton,
octyltetraoxyethylene ether, etc.), alcohols (e.g. ethanol,
isopropanol, methanol, 2-methyl-2,4-pentanediol (MPD)), salts
(e.g. chlorides, acetates, sulphates, phosphates, bromides,
iodides, fluorides, nitrates, bicarbonates, chlorates, chromates,
citrates, tartrates, cacodylates, formates, hydroxides, etc.),
polyethylene glycols (PEGs), ethylene glycols, methoxy
polyethylene glycols (MPEGs), heavy atoms and ions (e.g. iron,
copper, zinc, cobalt, manganese, nickel, tungstates, vanadates,
sodium, magnesium, potassium, lithium, calcium, aluminium, xenon,
etc.), or other additives such as dimethylsulfoxide (DMSO),
denaturants (e.g. urea, guanidinium chloride, etc.), glycerol,
sulfobetaines, jeffamines, AMPPNP, ATP, ADP, GTP, GDP, peptides,
tertiary-butanol, amino acids, azides, DNAs, RNAs, sugars,
lipids, drugs, etc.

There are numerous crystallisation kits available (e.g. from
Hampton Research), which attempt to broadly sample as many
parameters in crystallization space as possible. In many cases
these general screens help to identify a starting point for
crystallisations in the form of crystalline precipitates and/or
rough, or micro-, crystals. Typically these crystals are
unsuitable for direct diffraction analysis and require further
optimisation. Successful crystallization can be aided by
knowledge of a protein's behaviour in terms of solubility,
dependence on metal ions for correct folding or activity,
interactions with other molecules and any other data that are
available.

Screening such a large number of parameters represents an
extremely complex multi-dimensional search problem and is, as
such, exceptionally difficult to perform in a systematic manner.
Even with the advent of automated protein crystallisation it is
often the case that crystallisation of a protein will require a
very high degree of human input and will be subject to the impact
of intangible parameters such as serendipity, insight, and random
error.
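
The scale of this search problem can be made concrete by counting
the conditions in even a coarse full-factorial grid. The sketch
below is purely illustrative: the parameter names and values are
hypothetical choices drawn from the kinds of variables listed
above, not a recommended screen, and commercial kits typically
sample such a space sparsely rather than exhaustively.

```python
from itertools import product

# Hypothetical, deliberately coarse grid; a real screen would vary
# many more parameters and many more values of each.
screen = {
    "pH": [4.5, 5.5, 6.5, 7.5, 8.5],
    "precipitant": ["PEG 4000", "ammonium sulphate", "MPD"],
    "precipitant_level": ["low", "medium", "high"],
    "salt": ["none", "NaCl", "MgCl2"],
    "temperature_C": [4, 20],
}


def enumerate_conditions(grid):
    """Yield every combination of parameter values as a dict."""
    names = list(grid)
    for values in product(*(grid[name] for name in names)):
        yield dict(zip(names, values))


conditions = list(enumerate_conditions(screen))
print(f"{len(conditions)} conditions from {len(screen)} parameters")
print("first condition:", conditions[0])
```

Five parameters with two to five values each already give 270
distinct conditions, and every additional parameter multiplies
that total, which is why exhaustive screening quickly becomes
impractical.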

If preliminary crystals have been obtained it is often necessary 
to further modify the crystallisation conditions in an attempt to 
simultaneously maximise the internal order and physical
dimensions of the crystals grown. Optimising these parameters is
deemed beneficial for helping to maximise the data quality
obtained in subsequent X-ray diffraction experiments.
Identification of a set of initial crystallisation conditions 
reduces the potential parameter space that has to be explored, 
but crystal optimisation can still remain a time consuming and 
laborious process. Techniques such as macro- or micro-seeding 
may also aid crystal optimisation. 

Details of some of the proteins crystallised, and information on
some of the protein crystallisation conditions identified, are
contained, for example, within the following internet databases:
http://wwwbmcd.nist.gov:8080/bmcd/bmcd.html (Gilliland, G.L., et
al., Acta Crystallogr., D50, 408-413 (1994));
http://xray.bmc.uu.se/embo/structdb/links.html;
http://www.mpibp-frankfurt.mpg.de/michel/public/memprotstruct.html;
http://www.rcsb.org/pdb/ (Berman, H.M., et al., Nucleic Acids
Research, 28, 235-242 (2000));
http://www.ebi.ac.uk/msd/

and have been described in the following publications:
Blundell, T., et al., "Protein Crystallography", Academic Press,
New York (1976); McPherson, et al., "Preparation and Analysis of
Protein Crystals" in "Preparation and Analysis of Protein
Crystals", John Wiley & Sons, New York (1982); Carter, et al.,
"Design of crystallization experiments and protocols", pages
47-71 in "Crystallization of Nucleic Acids and Proteins - A
Practical Approach", (Ducruix, A. & Giege, R., eds) IRL Press,
Oxford (1992); Ducruix, A., et al., "Methods of
crystallization", pages 73-98 in "Crystallization of Nucleic
Acids and Proteins - A Practical Approach", (Ducruix, A. & Giege,
R., eds) IRL Press, Oxford (1992); "Protein Crystallisation:
Techniques, Strategies and Tips", IUL Biotechnology Series
(1999), ISBN 0-9636817-5-3.

Obtaining X-ray diffraction data from soaked protein crystals

An X-ray diffraction experiment consists of exposing a protein
crystal to a collimated, coherent beam of X-rays and recording
the resulting X-ray diffraction pattern. A diffraction pattern
arises from the elastic scatter of X-rays off electrons within
the planes of atoms within a protein crystal. The mathematics
underlying X-ray diffraction may be represented in their simplest
form by the Bragg equation:

$$n\lambda = 2d\sin\theta$$

where $n$ is an integer, $\lambda$ is the wavelength of the
incident X-rays, $\theta$ is the scattering angle of the X-rays
off a given plane of atoms, and $d$ is the Bragg spacing, or
spacing between successive planes of atoms (Bragg planes). Thus
as long as $\lambda$ corresponds to atomic scales (i.e.
$\lambda \approx 1$ Å) then atomic scale features should be
discernible within an electron density map calculated using the
diffraction data. Typically it is considered that X-ray data are
required to a Bragg spacing of <3 Å, for a given X-ray
wavelength, if the data from an X-ray diffraction experiment are
to be of use in determining the atomic positions within a
structure. During an experiment the various Bragg planes within a
crystal will only satisfy the mathematical criteria necessary for
diffraction at specific crystal orientations relative to the
incident X-ray beam. So as to obtain diffraction data relating to
as many Bragg planes as possible the crystal is rotated in the
X-ray beam. Thus all possible plane orientations are explored.
The angle through which a crystal must be rotated in order to
obtain a complete set of data relating to a specific Bragg
spacing is defined by the space group of the crystal and also the
initial orientation of the crystal in the X-ray beam. The
smallest Bragg spacing for which diffraction data are available
is termed the resolution of the experiment. It is desirable to
maximise the data completeness for an experiment. That is, data
should ideally be 100% complete up to the resolution limit of the
experiment.
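
As a worked illustration of the Bragg equation and of data
completeness, the short sketch below computes the scattering angle
needed to reach a given Bragg spacing, the inverse calculation,
and a completeness percentage. It is a minimal example only: the
function names are hypothetical, the wavelength of 1 Å is an
assumed value typical of atomic-scale X-rays, and completeness is
taken simply as measured unique reflections divided by the number
theoretically possible to the resolution limit.

```python
import math


def bragg_angle_deg(d_spacing, wavelength, order=1):
    """Scattering angle theta in degrees from n*lambda = 2*d*sin(theta)."""
    s = order * wavelength / (2.0 * d_spacing)
    if not 0.0 < s <= 1.0:
        raise ValueError("no diffraction: n*lambda/(2d) must lie in (0, 1]")
    return math.degrees(math.asin(s))


def bragg_spacing(theta_deg, wavelength, order=1):
    """Bragg spacing d from n*lambda = 2*d*sin(theta)."""
    return order * wavelength / (2.0 * math.sin(math.radians(theta_deg)))


def completeness_percent(n_measured_unique, n_possible_unique):
    """Percentage of the possible unique reflections actually measured."""
    if n_possible_unique <= 0:
        raise ValueError("number of possible reflections must be positive")
    return 100.0 * n_measured_unique / n_possible_unique


wavelength = 1.0  # angstrom (assumed)
theta = bragg_angle_deg(3.0, wavelength)
print(f"theta for d = 3 A: {theta:.2f} degrees")
print(f"d recovered from theta: {bragg_spacing(theta, wavelength):.2f} A")
# sin(theta) cannot exceed 1, so d cannot be smaller than lambda / 2.
print(f"smallest accessible d: {wavelength / 2.0:.2f} A")
# Hypothetical reflection counts, for illustration only.
print(f"completeness: {completeness_percent(950, 1000):.1f} %")
```

The limit d >= lambda/2 follows directly from sin(theta) <= 1 and
shows why the wavelength must be comparable to interatomic
distances for atomic detail to be recorded.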

Unfortunately the high energy of X-ray photons means that they 
cause damage to protein molecules. This is thought to be at 
least partially due to the generation of ions and free radicals 
within the crystals. Prolonged exposure of a protein containing 
crystal to an X-ray beam will thus result in a deterioration and 
decay of the proteins. This is typically manifested by a decline 
in the data resolution and quality. This problem may be 
partially circumvented by cryogenically freezing protein 
containing crystals in vitreous ice at 100 K. Freezing of the 
crystal means that any ion, or free radical, species that are 
generated are unable to migrate through the crystal. Thus the 
longevity of the crystal in the X-ray beam is extended and the 
data quality and resolution typically improved relative to an 
unfrozen data collection. Normally protein containing crystals 
cannot be directly frozen in the solutions in which they grew 
(mother liquor). This is because direct freezing often leads to 
the formation of ice crystals within the mother liquor. These 
ice crystals can destroy the internal order within a protein 
crystal and thus abolish diffraction. The addition of a cryo-
protectant to the protein's mother liquor can, however, lead to 
the formation of vitreous ice on freezing. This should not 
destroy the internal order of a protein crystal and thus retain 
diffraction from the crystal. Normally protein containing 
crystals must be transferred from their mother liquor into a 
specially formulated cryo-protectant solution prior to freezing. 

The exact composition of the cryo-protectant solution, the 
transfer protocol, and the freezing protocol must be uniquely 
determined for each crystal system and often for each experiment. 

Determination of ligand structure 

A mathematical operation termed a Fourier transform relates the 
diffraction pattern observed from a crystal and the molecular 
structure of the protein and ligand comprising the crystal 
(Blundell, T., et al., "Protein Crystallography", Academic Press, 
New York (1976); Drenth, "Principles of Protein X-ray 
Crystallography", Springer (1994)). A Fourier transform may be 
considered to be a summation of sine and cosine waves each with a 
defined amplitude and phase. Thus, in theory, it is possible to 
calculate the electron density associated with a protein and 
ligand structure by carrying out an inverse Fourier transform on 
the diffraction data. This requires amplitude and phase 
information to be extracted from the diffraction data. Amplitude 
information may be obtained by analysing the intensities of the 
spots within a diffraction pattern. The conventional methods for 
recording diffraction data do, however, mean that any "phase 
information" is lost. This phase information must be in some way 
recovered and the loss of this information represents the 
"crystallographic phase problem". The phase information 
necessary for carrying out the inverse Fourier transform can be 
obtained via a variety of methods. 
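
The role of the phases may be illustrated, purely as a sketch, with a one-dimensional Fourier synthesis in Python. The two-atom "unit cell", the function names and the choice of ten Fourier terms either side of zero are hypothetical and are used only to show that amplitudes alone do not recover the structure.

```python
import cmath
import math


def structure_factors(atoms, h_max):
    """Structure factors F(h) for a one-dimensional unit cell.

    atoms is a list of (fractional coordinate, scattering weight) pairs.
    """
    return {
        h: sum(f * cmath.exp(2j * math.pi * h * x) for x, f in atoms)
        for h in range(-h_max, h_max + 1)
    }


def density(factors, x):
    """Fourier synthesis: rho(x) = sum over h of F(h) * exp(-2*pi*i*h*x)."""
    total = sum(F * cmath.exp(-2j * math.pi * h * x) for h, F in factors.items())
    return total.real


# Hypothetical cell: a lighter atom at x = 0.25 and a heavier atom at x = 0.60.
atoms = [(0.25, 6.0), (0.60, 8.0)]
F = structure_factors(atoms, h_max=10)

# The experiment records intensities, so only the amplitudes |F(h)| survive.
amplitudes = {h: abs(Fh) for h, Fh in F.items()}
phases = {h: cmath.phase(Fh) for h, Fh in F.items()}

# Amplitudes recombined with the correct phases reproduce the atoms ...
with_phases = {h: cmath.rect(amplitudes[h], phases[h]) for h in F}
# ... whereas amplitudes with every phase set to zero pile density at the origin.
without_phases = {h: complex(amplitudes[h], 0.0) for h in F}
```

Evaluating `density(with_phases, x)` gives its largest values at the two atomic positions, while `density(without_phases, x)` peaks at x = 0 regardless of where the atoms are.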

If the structure of the unsoaked protein is already available, as 
would normally be the case, a set of theoretical amplitudes and 
phases may be calculated using the protein model and then the 
theoretical phases combined with the experimentally derived 
amplitudes. An electron density map may then be calculated and 
the protein and ligand structure observed. Electron density maps 
can be calculated using programs such as those contained in the 
CCP4 computing suite (Collaborative Computational Project 4, Acta 
Crystallographica, D50, 760-763 (1994)). For map visualization 
and model building, programs such as "O" (Jones, et al., Acta 
Crystallographica, A47, 110-119 (1991)) or "QUANTA" (Jones, et 
al., (1991) and commercially available from Accelrys, San Diego, 
California) can be used. 

An alternative approach employs (i) X-ray crystallographic 
diffraction data from the complex of ligand and protein and (ii) 
a three-dimensional structure of the unsoaked protein, to 
generate a difference Fourier electron density map of the 
complex. The difference Fourier electron density map may then be 
analysed to identify the ligand. 
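
For orientation, the two kinds of map discussed above are commonly written as follows. The notation is an assumption introduced here (V is the unit cell volume, h indexes the reflections, F_obs are the measured amplitudes from the complex, and F_calc and φ_calc are calculated from the model of the unsoaked protein) and is given as a sketch rather than as the exact coefficients used by any particular program.

```latex
% Map from observed amplitudes combined with model phases
\rho(\mathbf{x}) = \frac{1}{V}\sum_{\mathbf{h}}
  \lvert F_{\mathrm{obs}}(\mathbf{h})\rvert\,
  e^{i\varphi_{\mathrm{calc}}(\mathbf{h})}\,
  e^{-2\pi i\,\mathbf{h}\cdot\mathbf{x}}

% Difference Fourier map: density present in the complex but absent from the model
\Delta\rho(\mathbf{x}) = \frac{1}{V}\sum_{\mathbf{h}}
  \bigl(\lvert F_{\mathrm{obs}}(\mathbf{h})\rvert - \lvert F_{\mathrm{calc}}(\mathbf{h})\rvert\bigr)\,
  e^{i\varphi_{\mathrm{calc}}(\mathbf{h})}\,
  e^{-2\pi i\,\mathbf{h}\cdot\mathbf{x}}
```

Positive features in the difference map that are not accounted for by the protein model are the candidates for bound ligand.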

Analysis of electron density maps may be aided by software, for 
example, AutoSolve® (Blundell, T., et al., Nature Reviews Drug 
Discovery, 1, 45-54 (2002)) or the ligand fitting module in 
QUANTA, X-LIGAND (QUANTA: see above; X-LIGAND: Oldfield, T.J., 
Acta Crystallogr. D Biol. Crystallogr., 57(5), 696-705 (2001)). 

If there is no known structure of the protein then alternative 
methods for obtaining phases must be explored so as to resolve 
the structure of the unsoaked protein (Blundell, T., et al., 
"Protein Crystallography", Academic Press, New York (1976)). One 
method is multiple isomorphous replacement (MIR). This relies on 
soaking "heavy atom" (e.g. platinum, uranium, mercury, etc.) 
compounds into the crystals and observing how their incorporation 
into the crystals modifies the spot intensities observed in the 
diffraction pattern. An alternative method for obtaining phase 
information for a protein of unknown structure is to perform a 
multi-wavelength anomalous dispersion (MAD) experiment. This 
relies on the absorption of X-rays by electrons at certain 
characteristic X-ray wavelengths. Anomalous scattering by atoms 
within a protein will modify the diffraction pattern obtained 
from the protein crystal. Thus if a protein contains atoms which 
are capable of anomalous scattering a diffraction dataset 
(anomalous dataset) may be collected at an X-ray wavelength at 
which this anomalous scattering is maximal. The most usual way 
to introduce anomalous scatterers into a protein is to replace 
the sulphur containing methionine amino acid residues with 
selenium containing seleno-methionine residues. This is done by 
generating recombinant protein that is isolated from cells grown 
in controlled growth media that contains seleno-methionine 
(Doublié, S., Methods in Enzymology, 276, 523-530 (1997)). 
Selenium is capable of anomalously scattering X-rays and may thus 
be used for a MAD experiment. Another method generally available 
for the calculation of the phases necessary for the determination 
of an unknown protein structure is molecular replacement. This 
method relies upon the assumption that proteins with similar 
amino acid sequences (primary sequences) will have a similar fold 
and three-dimensional structure (tertiary structure). Examples 
of computer programs known in the art for performing molecular 
replacement are CNX (Brunger, A.T., et al., Current Opinion in 
Structural Biology, 8(5), 606-611 (1998) and also commercially 
available from Accelrys, San Diego, CA) or AMoRe (Navaza, J., 
Acta Cryst., A50, 157-163 (1994)). The phase information 
obtained by one of these means, when combined with the 
experimentally obtained amplitudes from the native dataset, 
enables an electron density map of the unknown protein molecule 
to be calculated using the Fourier transform method. 

If an electron density map has been calculated for a protein of 
unknown structure then the amino acids comprising the protein 
must be fitted into the electron density for the protein. This 
is normally done manually, although high resolution data may 
enable automatic model building. The process of model building 
and fitting the amino acids to the electron density can be both 
time consuming and laborious. Once the amino acids have 
been fitted to the electron density it is necessary to refine the 
structure. Refinement attempts to maximise the correlation 
between the experimentally calculated electron density and the 
electron density calculated from the protein model built 
(Blundell, T., et al., "Protein Crystallography", Academic Press, 
New York (1976) and "Methods in Enzymology", vols. 114 & 115, 
Wyckoff, H.W., et al., eds., Academic Press (1985)). Refinement 
also attempts to optimise the geometry and disposition of the 
atoms and amino acids within the user-constructed model of the 
protein structure. Sometimes manual re-building of the structure 
will be required to release the structure from local energetic 
minima. There are now several software packages available that 
enable an experimentalist to carry out refinement of a protein 
structure such as CNX (see above), or REFMAC (Murshudov, G.N., et 
al., Acta Crystallographica, D53, 240-255 (1997)). There are 
certain geometry and correlation diagnostics that are used to 
monitor the progress of a refinement. These diagnostic 
parameters are monitored and rebuilding/refinement continued 
until the experimenter is satisfied that the structure has been 
adequately refined. 

The atomic coordinate data of the co-complexes formed from the 
methods of the invention can be routinely accessed using computer 
programs, for example, RASMOL (Sayle, et al., TIBS, 20, 374 
(1995)), which is a publicly available computer software package 
which allows access and analysis of atomic coordinate data for 
structure determination and/or rational drug design, or 
AstexViewer™, which is contained in the CCP4 computing suite. 

Information from X-ray crystallography 

The information obtained by the method according to the invention 
can be used to provide much more information than the identity of 
the ligand which has bound. As the information on the ligand 
results from fitting to the electron density measured in the 
protein active site, this can allow the mode of binding and 
interactions to be ascertained, which can be useful in further 
elaboration and optimisation of the ligand. 

The collections of compounds 

The collections of compounds for use in the present invention are 
produced by one or more synthetic processes designed to connect 
two or more sets of monomers together. Monomers are molecules 
which share a common reactivity that makes them capable of 
combining with another complementary monomer to form a larger 
compound. Each set of monomers will contain at least one monomer 
and preferably no more than 100 monomers, such that between about 
5 and about 1000 compounds are in each collection of compounds. 
It is preferred that all the compounds in a collection comprise a 
common functional group (sometimes referred to as a 'templating 
moiety') which is produced by the reaction of two or more 
complementary functional groups present on the monomer sets used 
in the synthetic process. In one preferred embodiment each 
compound in a collection is related to other members of the 
collection by virtue of being synthesised from at least one 
common monomer unit. By having common features or trends in the 
structures of the compounds in the collection, it makes it 
possible to identify which moieties in the compounds (derived 
from individual monomer units) bind best to the target protein, 
even if the independent monomer itself does not have any 
detectable binding. This is achieved by observing the preference 
for binding that the macromolecule exhibits for compounds from 
the collection. 

It is further preferred that the collection of compounds will 
have shape features that differ sufficiently to allow at least 
one set of monomers used in their production to be distinguished 
from each other. This then allows determination of the chemical 
structure of the bound ligand, or to at least determine part of 
its structure. If only part of its structure can be determined, 
re-synthesis of some of the members of the collection of compounds 
that contain this partial structure will be necessary and these 
compounds are soaked into the crystal as singlet experiments so 
that the chemical structure of the bound ligand or ligands can be 
determined. This type of re-synthesis approach is termed a 
chemical deconvolution and is well known to practitioners of 
combinatorial or parallel synthesis to identify the biologically 
active member from a mixture of synthesised compounds. 
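
As a sketch of how a collection arises from sets of monomers, the following Python fragment enumerates every combination of one monomer from each set and tests whether two compounds share a common monomer. The monomer labels and function names are hypothetical; the size limits are the preferred values stated above.

```python
from itertools import product


def enumerate_collection(*monomer_sets):
    """Return every compound obtainable by taking one monomer from each set."""
    if len(monomer_sets) < 2:
        raise ValueError("at least two sets of monomers are required")
    for monomers in monomer_sets:
        if not 1 <= len(monomers) <= 100:
            raise ValueError("each set should hold between 1 and 100 monomers")
    return list(product(*monomer_sets))


def shares_monomer(compound_a, compound_b):
    """True if two compounds were built from at least one common monomer."""
    return any(a == b for a, b in zip(compound_a, compound_b))


# Hypothetical monomer labels, used for illustration only.
set_a = ["A1", "A2", "A3"]
set_b = ["B1", "B2"]
collection = enumerate_collection(set_a, set_b)

# Preferred collection size: between about 5 and about 1000 compounds.
size_is_preferred = 5 <= len(collection) <= 1000
```

Grouping the compounds by shared monomer in this way is what allows a binding preference to be traced back to the moiety derived from an individual monomer.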

'In-situ' synthesis 

As mentioned above, protein crystals are sensitive to the media 
in which they are grown and kept, and therefore the methods 
chosen for 'in-situ' synthesis, i.e. synthesis of the collection 
of compounds in the presence of the target protein crystals, have 
to be chosen such that they can occur under conditions which will 
not destroy the target protein crystal. If some degradation of 
the crystals occurs, this should not be to such an extent that the 
X-ray diffraction results are not of sufficient quality to allow 
for identification of the ligand (see above). 

Typical reactions are those which can be carried out in 
purely aqueous media at around room temperature (4 to 30 °C), and 
at a pH of 4 to 10. 

Examples of these chemistries, which apply to both methods of 
combinatorial synthesis and parallel synthesis, include: 
Acetal or Ketal formation; Addition reactions; Aldol 
condensations and related condensation reactions; Allylations; 
Cycloaddition reactions; Disulfide formation; Hydrazone 
formation; Mannich reactions; Michael reactions and related 
Conjugate Addition reactions; Palladium mediated reactions; 
Reductive alkylation; Substitution reactions; and Three or Four 
Component Reactions. 

The following reactions can be carried out using synthetic 
procedures described in general organic chemistry texts such as 
March's Advanced Organic Chemistry (Smith, M.B. and March, J., 5th 
edition, Wiley-Interscience, New York, 2001) or references given 
therein. Some synthetic reactions carried out in aqueous 
conditions have recently been reviewed by Ulf Lindstroem, see 
"Stereoselective Organic Reactions in Water", Chemical Reviews, 
102(8), 2751-2771 (2002). The practice of combinatorial chemistry 
is described in references cited in publications by Roland Dolle 
(Journal of Combinatorial Chemistry, 4(5), 369-418 (2002); 
Journal of Combinatorial Chemistry, 3(6), 477-517 (2001); Journal 
of Combinatorial Chemistry, 2(5), 383-433 (2000); Molecular 
Diversity, 4(4), 233-256 (2000); Journal of Combinatorial 
Chemistry, 1(4), 235-282 (1999); Molecular Diversity, 3(4), 199-
233 (1998)). 

These reactions take place between appropriate sets of ligand 
precursor molecules ('monomers'), which are represented 
schematically below, with M representing a substituent group that 
is varied in each monomer set (e.g. M1, M2, M3, etc.). Each 
reacting set of monomers can have as few as a single member, up 
to, for example, 40 members, although a maximum of about 20 
members would be more usual. In some embodiments of the invention 
it is preferred that each set of reacting monomers comprises at 
least two monomers. A typical size for the resulting collection 
of compounds would be between 5 and 1000, preferably between 5 
and 100, with a range 10 or 20 to 50 or 70 being preferred. 

In the following examples, R represents a substituent or H on a 
monomer. 

Acetal or Ketal formation 

Addition of alcohols or a diol to an aldehyde or ketone, 
catalysed by acid. The reactions are fully reversible leading to 
thermodynamic products. 




Addition reactions 
For example, addition to an epoxide under acidic conditions, or 
addition of alcohols to enol ethers. 

[Reaction scheme] 

X = O, S, N or P, and where the ring may be optionally 
substituted on any available position. 

Aldol condensations and related condensation reactions 

For examples, see Kobayashi, S. and Manabe, K., Accounts of 

Chemical Research, 35(4), 209-217 (2002). 




Related condensation reactions are the Knoevenagel reaction, the 
Peterson reaction, the Perkin reaction, the Darzens reaction, 
Tollens' reaction, the Wittig reaction and the Thorpe reaction. 
Careful selection of the monomers is required in order for these 
reactions to proceed under aqueous conditions. 

Allylations 

For examples, see Kobayashi, S. and Hachiya, I., Yuki Gosei 
Kagaku Kyokaishi, 53(5), 370-80 (1995). 

Cycloaddition reactions 

For example, the Diels-Alder reaction, see Fringuelli, F., et 
al., European Journal of Organic Chemistry, 3, 439-455 (2001). 

Many variations exist of the above formula, including where 
heteroatoms are incorporated (e.g. aza-Diels-Alder reactions). 
Cycloadditions to form 5 membered ring systems are also very 
general and an illustrative example is the cycloaddition of 
nitrile oxides with alkynes to form isoxazoles, which occurs at room 
temperature under very mild conditions. 




An example of a cycloaddition reaction that is water-tolerant is 
the "click chemistry" described by Sharpless in Lewis, et al., 
Angew. Chem. Int. Ed., 41(6), 1053-1057 (2002). In this an azide 
and an acetylene undergo a Huisgen 1,3-dipolar cycloaddition to 
give 1,2,3-triazoles. 






Disulfide formation 

Occurs reversibly under very mild conditions. 

[Reaction scheme: two thiol-bearing monomers combining to form a disulfide] 

Hydrazone formation 

Occurs reversibly under very mild conditions. 




Mannich reactions 

For examples, see Akiyama, T., et al., Advanced Synthesis & 
Catalysis, 344(3+4), 338-347 (2002). 




Michael reactions and related Conjugate Addition reactions 

Addition of nucleophiles to α,β-unsaturated carbonyl compounds is 
another example of addition reactions suitable for use in the 
invention. 




The carbonyl can also be replaced by other electron withdrawing 
groups, such as nitro groups, see Da Silva, F. and Jones, J., 
Journal of the Brazilian Chemical Society, 12(2), 135-137 (2001). 

A related transformation is the Baylis-Hillman reaction, see Yu, 
C., et al., Journal of Organic Chemistry, 66(16), 5413-5418 
(2001). 

Palladium mediated reactions 

Many palladium mediated reactions can be carried out in aqueous 
media, e.g. Heck, Sonogashira, Tsuji-Trost, Suzuki, Stille, see 
Genet, J.P. and Savignac, M., Journal of Organometallic 
Chemistry, 576(1-2), 305-317 (1999). A representative 
illustration is the Suzuki cross coupling reaction: 

M1(Aryl)-B(OH)2 + M2(Aryl)-Hal --Pd(0)--> M1-M2 


Reductive alkylation 

Occurs under very mild conditions. An example carried out in the 
presence of a protein is Hochguertel, M., et al., Proceedings of 
the National Academy of Sciences of the United States of America, 
99(6), 3382-3387 (2002). 

[Reaction scheme: reductive alkylation] 

Substitution reactions 

Many useful substitution reactions occur under aqueous 
conditions, e.g. nucleophilic displacement (with alcohols, 
amines, thiols, carboxylic acids, enolates, hydrazines, dithianes 
etc.) of alkyl halides, tosylates, mesylates and azides; ester, 
amide and urea formation by displacement of an activated ester or 
carbonate or carbamate; and aromatic nucleophilic substitution of 
electron deficient aromatic compounds with amines, alcohols, 
thiols etc. 






An example of alkylation chemistry in the presence of a protein 
is Nguyen, R. and Huc, I., Angew. Chem. Int. Ed., 40(9), 1774-6 
(2001). 



A generic scheme is as follows: 
[Reaction scheme] 

where LG is a leaving group, and X is a nucleophilic heteroatom 
or carbon anion. 

Three or Four Component Reactions 

A number of multicomponent reactions proceed under mild mixed 
aqueous conditions and are suitable for combinatorial library 
design for the purposes of this invention. One example is the Ugi 
condensation (see Dömling, A., Current Opinion in Chemical 
Biology, 6(3), 306-313 (2002); Ugi, I., et al., Combinatorial 
Chemistry, 125-165 (1999)): 

[Reaction scheme: Ugi condensation] 



Also encompassed in the scope of the invention is design of 
combinatorial reactions in which more than one functional group 
can be present on any given monomer so that multimeric ligands 
can be assembled. In this way two or more monomers can be 
assembled by two or more functional group interconversions using 
chemistry illustrated above or other chemistry possible under 
mild aqueous conditions. For example, schematically: 
[Reaction scheme] 

in which the monomer sets containing groups M1 and M2 react 
together through one chemistry (A+B=X) to give trimeric products 
containing the groups M1 and M2, and monomer sets containing M1, M2 
and M3 react together through two chemistries (A+B=X and P+Q=Y) to 
give trimeric products containing the groups M1, M2 and M3. 



Other factors which need to be considered in designing suitable 
combinatorial library conditions under aqueous conditions are as 
follows: 

Solubility - the solubility in water of the reacting monomers may 
limit the efficiency of the transformations. 

Catalysis - Brønsted and Lewis acid catalysis and other 
catalysts may be used to allow a transformation to proceed in an 
aqueous environment. [For example Lindstroem, U., Chemical 
Reviews, 102(8), 2751-2771 (2002).] 

Micelles - the use of micellar catalysts to enable the use of 
water as a solvent. [For examples see Lindstroem, U. (2002)] 

Cosolvents - the use of cosolvents to enable the use of water as 
a solvent including, but not limited to, DMSO, polyethylene 
glycols, ethylene glycols, methanol, ethanol, isopropanol, 
acetone and acetonitrile. These cosolvents should be compatible 
with the protein crystal and may typically be used in an amount 
of up to 20% of the solvent system, which is preferred, although 
an amount of up to 40% or higher may be possible. 

Solubilisers - the use of solubilising agents to enable the use 
of water as a solvent in reactions of organic compounds. [For 
examples see Lindstroem, U. (2002)] 

Surfactants - the use of surfactants to enable the use of water 
as a solvent in reactions of organic compounds. [For examples see 
Lindstroem, U. (2002)] These surfactants should be compatible 
with the protein crystal. 

The above described combinatorial library synthesis and 
procedures can similarly be adapted for mixed aqueous solvent 
conditions. 
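
The typical ranges quoted above for 'in-situ' synthesis (4 to 30 °C, pH 4 to 10, and cosolvent at up to 20% preferred or up to 40% possible) can be expressed, purely as an illustrative sketch, as simple checks. The function names and category labels are assumptions, and passing these checks does not establish that a given protein crystal will tolerate the conditions.

```python
def typical_in_situ_conditions(temperature_c, ph):
    """True if temperature and pH fall in the typical ranges quoted in the text."""
    return 4.0 <= temperature_c <= 30.0 and 4.0 <= ph <= 10.0


def classify_cosolvent(percent):
    """Classify a cosolvent level (percent of the solvent system)."""
    if not 0.0 <= percent <= 100.0:
        raise ValueError("percent must lie between 0 and 100")
    if percent <= 20.0:
        return "preferred"
    if percent <= 40.0:
        return "possible"
    # The text allows that higher amounts may be possible; treat as unverified.
    return "may be possible; check crystal compatibility"
```

For instance, a reaction run at 20 °C and pH 7.4 in 20% DMSO falls within the typical and preferred ranges.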

In some embodiments of the present invention, it is preferred 
that the monomers cannot substantially bind to the target 
protein, but that only the collection of compounds formed include 
compounds that show affinity for the target protein. Preferably 
the ratio of the Ki of the strongest binding ligand formed to the 
Ki of the strongest binding monomer is at least 10 to 1, more 
preferably 100, 1000 or even 10000 to 1. 
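
The affinity preference stated above may be sketched as follows. This assumes one reading of the ratio, namely that the strongest binding ligand binds at least 10-fold more tightly than the strongest binding monomer; since a smaller Ki denotes tighter binding, the fold improvement is computed here as Ki(monomer) divided by Ki(ligand). The function names and values are hypothetical.

```python
def affinity_gain(ki_monomer, ki_ligand):
    """Fold improvement in affinity of the ligand over the monomer.

    Both Ki values must be in the same units; a smaller Ki is tighter binding.
    """
    if ki_monomer <= 0 or ki_ligand <= 0:
        raise ValueError("Ki values must be positive")
    return ki_monomer / ki_ligand


def meets_preference(ki_monomer, ki_ligand, minimum_fold=10.0):
    """True if the ligand is at least minimum_fold tighter than the monomer."""
    return affinity_gain(ki_monomer, ki_ligand) >= minimum_fold
```

With Ki values of 1000 and 10 (in the same units) for monomer and ligand respectively, the gain is 100-fold, which satisfies the 10-fold and 100-fold preferences but not the 1000-fold one.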

In a preferred case, it is possible that the presence of the 
protein crystal in the solution in which the combinatorial 
chemistry takes place will exert an influence on the reactions 
occurring. If the reactions are reversible, then without wishing 
to be bound by theory, this would allow generation of 
thermodynamic products having the advantage of allowing for the 
'local enrichment' of products at the protein surface leading to 
formation of the most potent ligand possible in the library. The 
principles of this effect are described by Huc, I. and Nguyen, 
R., Combinatorial Chemistry and High Throughput Screening, 109-
130 (2001). If the reactions are irreversible, then without 
wishing to be bound by theory, this would allow generation of 
kinetic products having the advantage of 'local templating' of 
products at the protein surface leading to formation of the most 
potent ligand possible in the library. The principles of this 
effect are described by Nguyen, R. and Huc, I., Angew. Chem. Int. 
Ed., 40(9), 1774-6 (2001) and in 'click chemistry' e.g. Lewis, 
W., et al., Angew. Chem. Int. Ed., 41(6), 1053-1057 (2002). 

'Just-in-time' synthesis 

As 'just-in-time' synthesis is not carried out in the presence of 
the protein crystals, there is less restriction on the types of 
chemistries that can be used to generate the collections of 
compounds for screening. However, in selecting appropriate 
starting materials and reaction conditions, it is preferred that 
the reactions do not result in a large number of by-products that 
could interfere with the screening process, and so reactions that 
do not require extraneous reagents are preferred. 

The reactions discussed above in relation to the 'in-situ' 
synthesis would be particularly suitable, but other reactions 
could be considered, for example those using methodology in which 
the reagents that catalyse or drive the synthetic conversions are 
bound onto a solid phase medium and therefore removed from the 
solution by filtration. These chemistries, suitable for use with 
this invention, are described by Ley, S.V., et al., Perkin 1, 23, 
3815-4195 (2000) and references cited therein. 

Such reactions can be carried out using synthetic procedures 
described in general organic chemistry texts such as March's 
Advanced Organic Chemistry (Smith, M.B. and March, J., 5th 
edition, Wiley-Interscience, New York, 2001) or references given 
therein, and reference is also made to the texts on the practice 
of combinatorial chemistry given above. 

As in 'in-situ' synthesis, the reactions take place between 
appropriate sets of ligand precursor molecules ('monomers'), in 
which at least one substituent group is varied, to result in sets 
of monomers. Each reacting set of monomers can have as few as a 
single member, up to, for example, 40 members, although a maximum 
of about 20 members would be more usual. In some embodiments of 
the invention it is preferred that each set of reacting monomers 
comprises at least two monomers. A typical size for the 
resulting collection of compounds would be between 5 and 1000, 
preferably between 5 and 100, with a range 10 or 20 to 50 or 70 
being preferred. 

The usual purification and characterisation steps which are used 
in the practice of combinatorial or parallel synthesis are not 
required in methods according to the present invention. These 
steps are viewed as essential in the conventional practice of 
these synthesis methods in order to produce compounds suitable 
for testing in a biological assay, as described in many of the 
references cited herein. Purification, as the term is used here, 
does not include physical separation techniques such as solvent 
evaporation or removal of insolubles, e.g. by sedimentation, 
centrifugation or filtration. Conventional purification methods 
include aqueous extraction, trituration, chromatography such as 
flash column chromatography or HPLC purification, crystallisation 
and distillation, although certain characterisation methods can 
also be used in purification. Characterisation may be carried out 
in a number of ways, including using LCMS, MS and NMR analysis. 

In a particularly preferred embodiment of the invention, the 
chemistry used for the 'just-in-time' synthesis is carried out in 
a solvent suitable as a co-solvent for aqueous solutions of 
protein crystals. Such solvents include DMSO, NMP and alcohols 
such as methanol and ethanol (for more details, see discussion of 
cosolvents in 'in-situ' synthesis). In this way, after 
incubation of the reaction for a suitable period of time, 
aliquots can be taken of the solution and added directly to the 
protein crystal containing solution, without the need for any 
purification or characterisation steps. 

Example 

The protein chosen as the target macromolecule was cyclin-
dependent kinase 2 (CDK2). This target has been the subject of 
intense study with the aim of developing inhibitors for the 
treatment of a number of human cancers, and has been crystallized 
with a number of inhibitors bound in the ATP-binding groove (De 
Azevedo, et al., Eur. J. Biochem., 243, 518-526 (1997); 
Hoessel, R., et al., Nat. Cell Biol., 1, 60-67 (1999)). 

The collection of compounds chosen for synthesis was based on 
the oxindole template, a class of inhibitors already 
disclosed for CDK2 (Bramson, H.N., et al., J. Med. Chem., 44, 
4339-4358 (2001)). These ligands (AB; Scheme 1) present 
substituents in adjacent lipophilic binding pockets within the 
ATP binding groove and can be disconnected to monomers of 
approximately equal size and complexity (hydrazines A and isatins 
B; Scheme 1). 



Scheme 1 




A range of hydrazines (A1 to A6; Table 1) and isatins (B1 to B5; 
Table 1) were chosen, so that the collection of oxindoles formed 
would present a range of functional groups to the ATP binding 
site. 
Table 1 

Hydrazines: A1, A2, A3, A4, A5, A6 
Isatins: B1, B2, B3, B4, B5 

[Chemical structures of the monomers] 





The synthesis of the collection of ligands was then demonstrated 
to proceed under aqueous conditions in the presence of 20% of the 
co-solvent dimethylsulfoxide (DMSO), according to Scheme 2. 

Analysis by mass spectrometry of the 30 reactions in the array 
indicated that all reactions successfully formed the expected 
products over a 24-72 hour period and the efficiency of the 
individual reactions was assessed by LC/MS analysis (Table 2). In 
most cases, under these reaction conditions, the products slowly 
precipitated from the reaction solution over time. It is clear 
from the qualitative data presented in Table 2 that monomer A6 
did not give high conversions or high purity in the reactions and 
that some other individual reactions were also poor; however, the 
variability in absolute amount of products formed between 
individual reactions would be expected to be no more than 10-fold 
from these results. 
Table 2 (% purity by peak area of product by LC/MS) 

A x B          A1      A2 (Cl)  A3 (Cl)  A4 (Cl)  A5 (SO2NH2)  A6 (Cl; SO2Me) 
B1 (NO2)       10-25   60-95    60-95    30-50    60-95        30-50 
B2 (Cl)        60-95   60-95    60-95    60-95    60-95        30-50 
B3 (SO3H)      10-25   60-95    30-50    10-25    60-95        30-50 
B4 (CF3)       30-50   60-95    60-95    60-95    60-95        30-50 
B5 (OCF3)      30-50   60-95    60-95    60-95    60-95        10-25 

Substituents (R groups) on each monomer are given in parentheses. 
R groups = H unless indicated otherwise. 
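
The qualitative reading of Table 2 can be reproduced with a short tally. The sketch below transcribes the purity bands from Table 2 and counts, for each hydrazine, how many of its five reactions fall in the highest band; the variable and function names are illustrative assumptions.

```python
# Purity bands transcribed from Table 2.
# Rows are isatins B1-B5; columns are hydrazines A1-A6, in that order.
HYDRAZINES = ["A1", "A2", "A3", "A4", "A5", "A6"]
TABLE_2 = {
    "B1": ["10-25", "60-95", "60-95", "30-50", "60-95", "30-50"],
    "B2": ["60-95", "60-95", "60-95", "60-95", "60-95", "30-50"],
    "B3": ["10-25", "60-95", "30-50", "10-25", "60-95", "30-50"],
    "B4": ["30-50", "60-95", "60-95", "60-95", "60-95", "30-50"],
    "B5": ["30-50", "60-95", "60-95", "60-95", "60-95", "10-25"],
}


def high_purity_counts(table, columns, high_band="60-95"):
    """Count, per column, the reactions whose purity falls in the highest band."""
    counts = {column: 0 for column in columns}
    for row in table.values():
        for column, band in zip(columns, row):
            if band == high_band:
                counts[column] += 1
    return counts


counts = high_purity_counts(TABLE_2, HYDRAZINES)
total_reactions = sum(len(row) for row in TABLE_2.values())
```

The tally gives no high-purity reactions for A6, in line with the observation that this monomer did not give high conversions, and it also shows that A1 reached the highest band with only one of the five isatins.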



Studies of kinetic competition between the monomers indicated 
that the products formed non-stoichiometric mixtures, as would be 
expected given that the hydrazines and isatins have varying 
reactivities. LC/MS analysis of competition experiments (using a 
photodiode array detector scanning from 200-400 nm wavelengths) 
indicated that mixtures of the hydrazines reacted with a 10-fold 
deficit of each isatin gave less than a 5-fold excess of any 
individual product over any other and typically less than a 
3-fold excess. These data again indicated that monomer A6 tended 
to be disfavored but again a measurable amount of product 
resulting from reaction of this monomer with each isatin was 
detected (5-10% of the mixture). 

These results show that during ligand synthesis in the presence 
of the protein all of the reaction products were capable of being 
formed at a useful concentration. Additionally, under these 
conditions, thermodynamic products would be expected to 
predominate because the reactions are fully reversible. 

Analysis of the reactions, e.g. by LC-MS, and isolation of the
compounds are not required for the method of the invention. However,
having now established reaction conditions suitable for use in
the X-ray screening method disclosed in this invention, it is now
possible to change the substitution patterns present in the
monomer sets A and B (i.e. R^-R^) and to carry out in situ
synthesis of further libraries of compounds, not described
herein, in the presence of CDK2 crystals.

Crystals of full length (residues 1-298) human cdk2 were grown.
For soaking purposes crystals were transferred into a solution
that maintained the ionic strength and precipitant concentrations
of the original crystal mother liquor, but also contained 20%
DMSO and the hydrazine and isatin reactive species (see Table 3).

In all cases the total hydrazine concentration was in 10-fold
excess over the isatin concentration.

The present invention also includes the use of the aforementioned
methods for the generation of CDK2 ligands.

Experimental Methods 

Crystals of full length (amino acids 1-298) human cyclin
dependent kinase 2 (cdk2) were grown under the conditions
detailed in (1) Lawrie, et al., Nat. Struct. Biol., 4, 796-800
(1997) and (2) Rosenblatt, J., et al., J. Mol. Biol., 230, 1317-
1319 (1993). Crystals grown using (1) were obtained using the
hanging drop, vapour diffusion method, at 4°C or 18°C. 1 µl of
10 mg/ml cdk2 in 10 mM HEPES/NaOH pH 7.4, 15 mM NaCl, was mixed
with 1 µl of reservoir solution. The reservoir solutions (1 ml
total volume) contained (25-55) mM ammonium acetate, (10-17.5)%
polyethylene glycol (PEG) (average molecular weight 3350) and 100
mM HEPES/NaOH pH 7.4. Crystals grown using (2) were also
produced by the hanging drop, vapour diffusion method, at 4°C. 4
µl of 10 mg/ml cdk2 in 10 mM HEPES/NaOH pH 7.4 were suspended over
a 1 ml reservoir solution containing (200-800) mM HEPES/NaOH pH
7.4.



For soaking purposes crystals grown using (1) were transferred
into microbridges containing 20 µl of soak solution. Soak
solutions were prepared such that the ammonium acetate, PEG and
HEPES/NaOH pH 7.4 concentrations were identical to those of the
drop from which a given crystal was harvested. The soak solutions
further contained 20% DMSO and the hydrazine and isatin species
that were to be reacted. Hydrazine and isatin concentrations in
the ranges (5-20) mM (total organic) and (0.5-2) mM (total organic)
respectively were used. Total organic refers to the fact that in
a soak solution containing multiple isatins and hydrazines the
combined concentration of all the hydrazines would be (5-20) mM
and the combined concentration of all the isatins would be (0.5-2)
mM. All hydrazines were present in equimolar concentrations and
all isatins were present in equimolar concentrations. That is,
if a soak solution contained two isatins and four hydrazines with
total organic hydrazine and isatin concentrations of 10 mM and 1
mM respectively, each of the isatins would be present at a
concentration of 0.5 mM and each of the hydrazines at a
concentration of 2.5 mM. Soak solutions typically contained
permutations of (1-5) isatins and/or (1-6) hydrazines (see Table
1). The generic formula for the products formed by permuting
the reactants is given in Scheme 2 (see also Table 2). The
exact nature of the products formed is detailed in Table 2.
Crystals were soaked for 3-5 days at 18°C.
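
The "total organic" arithmetic described above can be illustrated with a short sketch. This is only an illustration of the equimolar split stated in the text; the function name and its parameters are hypothetical and are not part of the disclosed method.

```python
def per_component_concentrations(total_hydrazine_mM, total_isatin_mM,
                                 n_hydrazines, n_isatins):
    """Split each 'total organic' concentration equally among its components.

    Hypothetical helper: assumes, as stated in the text, that all hydrazines
    are equimolar and all isatins are equimolar within one soak solution.
    Returns (mM per hydrazine, mM per isatin).
    """
    if n_hydrazines < 1 or n_isatins < 1:
        raise ValueError("a soak needs at least one hydrazine and one isatin")
    return (total_hydrazine_mM / n_hydrazines,
            total_isatin_mM / n_isatins)


# Worked example from the text: four hydrazines at 10 mM total organic and
# two isatins at 1 mM total organic.
each_hydrazine, each_isatin = per_component_concentrations(10.0, 1.0, 4, 2)
print(f"each hydrazine: {each_hydrazine} mM, each isatin: {each_isatin} mM")
```

For the example given in the text this reproduces 2.5 mM per hydrazine and 0.5 mM per isatin.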

Crystals were frozen by momentarily dipping them into a cryo-
protectant solution and then snap cooling them in liquid
nitrogen. The cryo-protectant solutions contained (17.5-22.5)%
glycerol and ammonium acetate, PEG and HEPES/NaOH pH 7.4
concentrations that were identical to those of the drops from
which the crystals were originally harvested.

Crystals grown using (2) were soaked and frozen identically to
those grown using (1), except that rather than maintaining
ammonium acetate, PEG and HEPES/NaOH pH 7.4 concentrations, only
the HEPES/NaOH pH 7.4 concentration was maintained.

X-ray diffraction data were collected from soaked crystals,
cooled to 100 K, on a Rigaku copper rotating anode source using
Rigaku/MSC Jupiter CCD or R-AXIS IV++ image plate detectors.
The data were integrated, reduced and scaled using either the
d*TREK suite (Pflugrath, J.W., Acta Crystallographica, D55, 1718-
1725 (1999)), or MOSFLM (Leslie, A.G.W. In Joint CCP4 and ESF-
EACMB Newsletter on Protein Crystallography, vol. 26, Warrington,
Daresbury Laboratory (1992)), SCALA and the CCP4 suite of
programs (CCP4: see above). An apo-cdk2 structure (De Bondt,
H.L., et al., Nature, 363, 595-602 (1993)) was used as a starting
point for structure refinements. The structures were initially
segmented into 25 amino acid sections and rigid-body refined in
CNX (see above). The structures were then subjected to iterative
cycles of positional and isotropic B-factor refinement in CNX,
followed by manual rebuilding using the graphics program "O" (see
above) and automated ligand fitting using AUTOSOLVE®. Final
water fitting was performed using in-house software. Details of
representative X-ray data are given in Table 3.
Table 3. Representative X-ray results and experimental conditions

Components in soaking solution, by soak:
  1   ?
  2   A5 + B3
  3   A5 + B5
  4   A5 + B1
  5   B2 + (A1, A2, A3, A4)
  6   B2 + (A1, A2, A3, A4, A5, A6)
  7   (B1, B2, B3, B4, B5) + (A1, A2, A3, A4, A5, A6)

Soak                        1          2          3          4          5          6          7

Space group                 P2₁2₁2₁    P2₁2₁2₁    P2₁2₁2₁    P2₁2₁2₁    P2₁2₁2₁    P2₁2₁2₁    P2₁2₁2₁
a (Å)                       53.68      53.69      53.50      53.73      53.53      53.62      53.69
b (Å)                       71.94      71.94      71.41      72.11      71.48      71.83      72.41
c (Å)                       71.90      72.25      72.12      72.36      72.22      71.89      72.10
Maximal resolution (Å)      2.20       2.25       2.65       2.70       2.20       2.20       2.80
[row label illegible]       16740      41430      22249      23195      36709      35997      22680
Unique reflections          14265      13583      8277       3470       14171      14250      6991
Completeness (%)            97.2       98.5       98.2       98.6       92.7       89.1       96.9
Rmerge¹                     0.088      0.153      0.122      0.143      0.057      0.035      0.127
Mean I/σI                   4.3        4.1        3.8        3.7        6.2        8.6        4.8
Highest resolution bin (Å)  ?-2.2      ?-2.25     2.74-2.65  2.74-2.65  2.28-2.2   2.28-2.2   2.95-2.8
  Completeness (%)          ?          97.8       99.5       100        86.8       35.2       98.0
  Rmerge¹                   0.28       0.284      0.332      0.344      0.187      0.134      0.258
  Mean I/σI                 2.0        2.4        2.0        2.1        2.3        3.0        2.8

Refinement
Protein atoms               ?          ?          2279       2279       2279       2279       2279
Other atoms:
  Inhibitor                 23         26         27         25         0          23         23
  Water                     ?          ?          ?          39         245        229        67
Resolution range (Å)        35-2.2     35-2.25    35-2.65    35-2.7     35-2.2     35-2.2     35-2.8
Rconv² (%)                  25.9       24.7       ?          ?          ?          24.5       21.1
Rfree³ (%)                  30.6       28.9       29.0       28.1       27.2       25.5       24.9
Mean B-factor (Å²):
  Protein                   ?          ?          ?          49.0       26.1       30.8       30.6
  Ligand                    31.2       37.3       47.0       59.7       NA         24.3       25.5
  Solvent                   47.2       42.8       35.5       50.4       34.2       41.5       34.8

Soaks
Isatin conc (mM)⁶           2          0.5        0.5        1          1          1          1 t.o.
Hydrazine conc (mM)         20         5          5          10         10 t.o.    10 t.o.    10 t.o.
Soak time (days)            5          3          3          3          3          3          3
Product present⁵            YES        ?          YES        YES        NO         YES        YES

? = illegible in source; t.o. = total organic⁴; NA = not applicable.

¹ $R_{merge} = \sum_h \sum_j |I_{h,j} - \langle I_h \rangle| / \sum_h \sum_j |I_{h,j}|$,
where $I_{h,j}$ is the j-th observation of reflection h and
$\langle I_h \rangle$ is the mean intensity of reflection h.

² $R_{conv} = \sum_h ||F_o| - |F_c|| / \sum_h |F_o|$, where $F_o$ and
$F_c$ are the observed and calculated structure factor amplitudes
respectively for the reflection h.

³ Rfree is equivalent to Rconv, but for a 5% subset of the
reflections not used for refinement.

⁴ Total organic refers to the total combined isatin, or hydrazine,
concentration in the soak, each component being present at a
concentration equal to the total organic concentration divided by
the number of isatin, or hydrazine, components in the mix.

⁵ Control experiments with crystals soaked for (3-5) days in
cocktails containing either B2 (20 mM), (B1, B2, B3, B4, B5) (1 mM
total organic⁴), or (A1, A2, A3, A4, A5, A6) (1 mM total organic) did
not reveal any interpretable ligand density. Crystals soaked in a 10
mM cocktail of (A1, A2, A3, A4, A5, A6) were repeatedly destroyed.

⁶ In certain instances it was found that crystals subjected to
soaks that contained only one hydrazine and one isatin sustained
increased damage as compared to those soaked in multi-component
isatin-hydrazine cocktails. Isatin and hydrazine concentrations
were iteratively reduced until soaking crystals could be
stabilised, and crystal damage diminished. Precipitation,
attributed to isatin-hydrazine product formation, was frequently
observed during soaks.
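
The residual definitions in the footnotes to Table 3 can be illustrated with a minimal sketch. It is not the code used by d*TREK, SCALA or CNX; the function names and the toy numbers are hypothetical, and Rmerge is written here with the mean intensity of each reflection as the reference value.

```python
def r_merge(observations):
    """Rmerge = sum_h sum_j |I_hj - <I_h>| / sum_h sum_j |I_hj|.

    `observations` maps a reflection index h to the list of its repeated
    intensity measurements I_hj (hypothetical data layout).
    """
    numerator = 0.0
    denominator = 0.0
    for intensities in observations.values():
        mean_intensity = sum(intensities) / len(intensities)
        numerator += sum(abs(i - mean_intensity) for i in intensities)
        denominator += sum(abs(i) for i in intensities)
    return numerator / denominator


def r_factor(f_obs, f_calc):
    """Rconv = sum_h ||Fo| - |Fc|| / sum_h |Fo|.

    Applied to the working reflections this gives Rconv; applied to the
    held-out test subset (5% in the text) it gives Rfree.
    """
    numerator = sum(abs(abs(fo) - abs(fc)) for fo, fc in zip(f_obs, f_calc))
    return numerator / sum(abs(fo) for fo in f_obs)


# Toy numbers chosen only to exercise the formulas.
toy_observations = {(1, 0, 0): [100.0, 110.0], (0, 1, 0): [50.0, 50.0]}
toy_r_merge = r_merge(toy_observations)
toy_r_conv = r_factor([10.0, 20.0], [9.0, 22.0])
print(f"Rmerge = {toy_r_merge:.3f}, Rconv = {toy_r_conv:.3f}")
```

Table 3 reports Rmerge as a fraction and Rconv and Rfree as percentages, so the value returned by `r_factor` would be multiplied by 100 for comparison with those rows.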

Discussion of Results 

Initial experiments were performed using reaction cocktails that
contained a single isatin and a single hydrazine. In these
instances, product binding was observed. Previous studies
(Bramson, H.N., et al., J. Med. Chem., 44, 4339-4358 (2001))
suggested that inhibitors derived from A5 (see Fig 1 & Table 2)
should possess a relatively high degree of potency due to the
presence of the sulphonamide group at the R^ position in the
ligand. A reaction cocktail containing B2 and (A1-A4) did not
reveal ligand binding, suggesting that the chlorine substitutions
at positions (R^-R^) did not confer significant potency upon the
product ligands. A cocktail that contained B2+(A1-A6) did,
however, reveal ligand binding, the electron density being
consistent with the product formed by the reaction of (B2+A5). Thus
the (B2+A5) reaction product was preferentially selected from a
set of six possible products, lending support to the notion that
the R^-sulphonamide group was conferring binding affinity upon the
A5B2 ligand. The degeneracy of the product library was increased
to 30 possible ligands using a reaction cocktail that contained
isatins (B1-B5) and hydrazines (A1-A6). Each isatin was present
at a concentration of 0.2 mM and each hydrazine at a
concentration of 1.67 mM (see Table 3). Difference electron
density consistent with the reaction product from (B2+A5) was
again observed. These studies suggest that the protein
preferentially selects the A5B2 ligand from the library of
available product ligands. The potency of this compound was
confirmed by synthesis of the A5B2 ligand. It is known from the
literature that compound A5B2 has an IC50 of 43 nM (Bramson, et
al., 2001).
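
The size of the 30-member product library and the per-component concentrations quoted above follow from simple counting, as the following sketch shows. The variable names are illustrative only; the monomer labels are those of Table 2, and the totals (1 mM isatin, 10 mM hydrazine, total organic) are taken from Table 3.

```python
from itertools import product

# Monomer labels as used in Table 2: hydrazines A1-A6 and isatins B1-B5.
hydrazines = [f"A{i}" for i in range(1, 7)]
isatins = [f"B{i}" for i in range(1, 6)]

# Every hydrazine can in principle condense with every isatin, so the
# library is the full set of pairings, named here in the text's "A5B2" style.
library = [h + b for h, b in product(hydrazines, isatins)]

# Equimolar split of the total organic concentrations given in Table 3.
total_isatin_mM = 1.0
total_hydrazine_mM = 10.0
each_isatin_mM = total_isatin_mM / len(isatins)
each_hydrazine_mM = total_hydrazine_mM / len(hydrazines)

print(len(library), "possible products")
print(f"each isatin: {each_isatin_mM:.2f} mM")
print(f"each hydrazine: {each_hydrazine_mM:.2f} mM")
```

The same enumeration with the single isatin B2 and hydrazines A1-A6 gives the six possible products of the B2+(A1-A6) cocktail.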



